102 research outputs found

    Entity-Centric Stream Filtering and Ranking: Filtering and Unfilterable Documents

    Cumulative Citation Recommendation (CCR) is defined as: given a stream of documents on one hand and Knowledge Base (KB) entities on the other, filter, rank and recommend citation-worthy documents. The pipeline encountered in systems that approach this problem involves four stages: filtering, classification, ranking (or scoring), and evaluation. Filtering is only an initial step that reduces the web-scale corpus into a working set of documents more manageable for the subsequent stages. Nevertheless, this step has a large impact on the recall that can be attained maximally. This study analyzes in-depth the main factors that affect recall in the filtering stage. We investigate the impact of choices for corpus cleansing, entity profile construction, entity type, document type, and relevance grade. Because failing on recall in this first step of the pipeline cannot be repaired later on, we identify and characterize the citation-worthy documents that do not pass the filtering stage by examining their contents

    Random performance differences between online recommender system algorithms

    In the evaluation of recommender systems, the quality of recommendations made by a newly proposed algorithm is compared to the state-of-the-art, using a given quality measure and dataset. Validity of the evaluation depends on the assumption that the evaluation does not exhibit artefacts resulting from the process of collecting the dataset. The main difference between online and offline evaluation is that in the online setting, the user’s response to a recommendation is only observed once. We used the NewsREEL challenge to gain a deeper understanding of the implications of this difference for making comparisons between different recommender systems. The experiments aim to quantify the expected degree of variation in performance that cannot be attributed to differences between systems. We classify and discuss the non-algorithmic causes of performance differences observed

    Cumulative Citation Recommendation: A Feature-aware Comparisons of Approaches

    In this work, we conduct a feature-aware comparison of approaches to Cumulative Citation Recommendation (CCR), a task that aims to filter and rank a stream of documents according to their relevance to entities in a knowledge base. We conducted experiments starting with a large feature set, identified a powerful subset, and used it to compare classification and learning-to-rank algorithms. With a small set of powerful features, we achieve better performance than the state-of-the-art. Surprisingly, our findings challenge the previously known preference of learning-to-rank over classification: in our study, the classification approach outperforms learning-to-rank on CCR. This indicates that comparing two approaches is problematic due to the interplay between the approaches themselves and the feature sets one chooses to use

    CWI at TREC 2012, KBA track and Session Track

    We participated in two tracks: the Knowledge Base Acceleration (KBA) Track and the Session Track. In the KBA track, we focused on experimenting with different approaches, as this is the first time the track has been run. We experimented with supervised and unsupervised retrieval models. Our supervised approach models include language models and a string-learning system. Our unsupervised approaches include using: 1) DBpedia labels and 2) the Google-Cross-Lingual Dictionary (GCLD). While the approach that uses GCLD targets the central and relevant bins, all the rest target the central bin. The GCLD and the string-learning system outperformed the others in their respective targeted bins. The goal of the Session track submission is to evaluate whether and how a logic framework for representing user interactions with an IR system can be used for improving the approximation of the relevant term distribution that another system that is supposed to have access to the session information will then calculate. the documents in the stream corpora. Three out of the seven runs used a Hadoop cluster provided by Sara.nl to process the stream corpora. The other four runs used federated access to the same corpora distributed among seven workstations

    CWI and TU Delft at TREC 2013: Contextual Suggestion, Federated Web Search, KBA, and Web Tracks

    This paper provides an overview of the work done at the Centrum Wiskunde & Informatica (CWI) and Delft University of Technology (TU Delft) for different tracks of TREC 2013. We participated in the Contextual Suggestion Track, the Federated Web Search Track, the Knowledge Base Acceleration (KBA) Track, and the Web Ad-hoc Track. In the Contextual Suggestion track, we focused on filtering the entire ClueWeb12 collection to generate recommendations according to the provided user profiles and contexts. For the Federated Web Search track, we exploited both categories from ODP and document relevance to merge result lists. In the KBA track, we focused on the Cumulative Citation Recommendation task, where we applied different features to two classification algorithms. For the Web track, we extended an ad-hoc baseline with a proximity model that promotes documents in which the query terms are positioned closer together

    Global, regional, and national burden of chronic kidney disease, 1990–2017 : a systematic analysis for the Global Burden of Disease Study 2017

    Background Health system planning requires careful assessment of chronic kidney disease (CKD) epidemiology, but data for morbidity and mortality of this disease are scarce or non-existent in many countries. We estimated the global, regional, and national burden of CKD, as well as the burden of cardiovascular disease and gout attributable to impaired kidney function, for the Global Burden of Diseases, Injuries, and Risk Factors Study 2017. We use the term CKD to refer to the morbidity and mortality that can be directly attributed to all stages of CKD, and we use the term impaired kidney function to refer to the additional risk of CKD from cardiovascular disease and gout. Methods The main data sources we used were published literature, vital registration systems, end-stage kidney disease registries, and household surveys. Estimates of CKD burden were produced using a Cause of Death Ensemble model and a Bayesian meta-regression analytical tool, and included incidence, prevalence, years lived with disability, mortality, years of life lost, and disability-adjusted life-years (DALYs). A comparative risk assessment approach was used to estimate the proportion of cardiovascular diseases and gout burden attributable to impaired kidney function. Findings Globally, in 2017, 1·2 million (95% uncertainty interval [UI] 1·2 to 1·3) people died from CKD. The global all-age mortality rate from CKD increased 41·5% (95% UI 35·2 to 46·5) between 1990 and 2017, although there was no significant change in the age-standardised mortality rate (2·8%, −1·5 to 6·3). In 2017, 697·5 million (95% UI 649·2 to 752·0) cases of all-stage CKD were recorded, for a global prevalence of 9·1% (8·5 to 9·8). The global all-age prevalence of CKD increased 29·3% (95% UI 26·4 to 32·6) since 1990, whereas the age-standardised prevalence remained stable (1·2%, −1·1 to 3·5). CKD resulted in 35·8 million (95% UI 33·7 to 38·0) DALYs in 2017, with diabetic nephropathy accounting for almost a third of DALYs. 
Most of the burden of CKD was concentrated in the three lowest quintiles of Socio-demographic Index (SDI). In several regions, particularly Oceania, sub-Saharan Africa, and Latin America, the burden of CKD was much higher than expected for the level of development, whereas the disease burden in western, eastern, and central sub-Saharan Africa, east Asia, south Asia, central and eastern Europe, Australasia, and western Europe was lower than expected. 1·4 million (95% UI 1·2 to 1·6) cardiovascular disease-related deaths and 25·3 million (22·2 to 28·9) cardiovascular disease DALYs were attributable to impaired kidney function. Interpretation Kidney disease has a major effect on global health, both as a direct cause of global morbidity and mortality and as an important risk factor for cardiovascular disease. CKD is largely preventable and treatable and deserves greater attention in global health policy decision making, particularly in locations with low and middle SDI

    Prevalence and attributable health burden of chronic respiratory diseases, 1990–2017 : a systematic analysis for the Global Burden of Disease Study 2017

    Background Previous attempts to characterise the burden of chronic respiratory diseases have focused only on specific disease conditions, such as chronic obstructive pulmonary disease (COPD) or asthma. In this study, we aimed to characterise the burden of chronic respiratory diseases globally, providing a comprehensive and up-to-date analysis on geographical and time trends from 1990 to 2017. Methods Using data from the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2017, we estimated the prevalence, morbidity, and mortality attributable to chronic respiratory diseases through an analysis of deaths, disability-adjusted life-years (DALYs), and years of life lost (YLL) by GBD super-region, from 1990 to 2017, stratified by age and sex. Specific diseases analysed included asthma, COPD, interstitial lung disease and pulmonary sarcoidosis, pneumoconiosis, and other chronic respiratory diseases. We also assessed the contribution of risk factors (smoking, second-hand smoke, ambient particulate matter and ozone pollution, household air pollution from solid fuels, and occupational risks) to chronic respiratory disease-attributable DALYs. Findings In 2017, 544.9 million people (95% uncertainty interval [UI] 506.9-584.8) worldwide had a chronic respiratory disease, representing an increase of 39.8% compared with 1990. Chronic respiratory disease prevalence showed wide variability across GBD super-regions, with the highest prevalence among both males and females in high-income regions, and the lowest prevalence in sub-Saharan Africa and south Asia. The age-sex-specific prevalence of each chronic respiratory disease in 2017 was also highly variable geographically. Chronic respiratory diseases were the third leading cause of death in 2017 (7.0% [95% UI 6.8-7.2] of all deaths), behind cardiovascular diseases and neoplasms. 
Deaths due to chronic respiratory diseases numbered 3 914 196 (95% UI 3 790 578-4 044 819) in 2017, an increase of 18.0% since 1990, while total DALYs increased by 13.3%. However, when accounting for ageing and population growth, declines were observed in age-standardised prevalence (14.3% decrease), age-standardised death rates (42.6%), and age-standardised DALY rates (38.2%). In males and females, most chronic respiratory disease-attributable deaths and DALYs were due to COPD. In regional analyses, mortality rates from chronic respiratory diseases were greatest in south Asia and lowest in sub-Saharan Africa, also across both sexes. Notably, although absolute prevalence was lower in south Asia than in most other super-regions, YLLs due to chronic respiratory diseases across the subcontinent were the highest in the world. Death rates due to interstitial lung disease and pulmonary sarcoidosis were greater than those due to pneumoconiosis in all super-regions. Smoking was the leading risk factor for chronic respiratory disease-related disability across all regions for men. Among women, household air pollution from solid fuels was the predominant risk factor for chronic respiratory diseases in south Asia and sub-Saharan Africa, while ambient particulate matter represented the leading risk factor in southeast Asia, east Asia, and Oceania, and in the Middle East and north Africa super-region. Interpretation Our study shows that chronic respiratory diseases remain a leading cause of death and disability worldwide, with growth in absolute numbers but sharp declines in several age-standardised estimators since 1990. Premature mortality from chronic respiratory diseases seems to be highest in regions with less-resourced health systems on a per-capita basis

    Epidemiology of injuries from fire, heat and hot substances : global, regional and national morbidity and mortality estimates from the Global Burden of Disease 2017 study

    Background Past research has shown how fires, heat and hot substances are important causes of health loss globally. Detailed estimates of the morbidity and mortality from these injuries could help drive preventative measures and improved access to care. Methods We used the Global Burden of Disease 2017 framework to produce three main results. First, we produced results on incidence, prevalence, years lived with disability, deaths, years of life lost and disability-adjusted life years from 1990 to 2017 for 195 countries and territories. Second, we analysed these results to measure mortality-to-incidence ratios by location. Third, we reported the measures above in terms of the cause of fire, heat and hot substances and the types of bodily injuries that result. Results Globally, there were 8 991 468 (7 481 218 to 10 740 897) new fire, heat and hot substance injuries in 2017 with 120 632 (101 630 to 129 383) deaths. At the global level, the age-standardised mortality caused by fire, heat and hot substances significantly declined from 1990 to 2017, but regionally there was variability in age-standardised incidence with some regions experiencing an increase (eg, Southern Latin America) and others experiencing a significant decrease (eg, High-income North America). Conclusions The incidence and mortality of injuries that result from fire, heat and hot substances affect every region of the world but are most concentrated in middle and lower income areas. More resources should be invested in measuring these injuries as well as in improving infrastructure, advancing safety measures and ensuring access to care. Peer reviewed